
Make sure blocks don't overlap to avoid outsider chunks #348

Merged: 4 commits merged into prometheus-junkyard:master from fix-block-boundaries on Jul 5, 2018

Conversation

@BenoitKnecht (Contributor) commented Jun 13, 2018:

Fixes #347

@bwplotka (Contributor) left a comment:

I think this will not fix our case. I explained that in the comment here: #347 (comment)

EDIT: never mind; I mistakenly thought rangeForTimestamp had nothing to do with the chunk range. Reviewing again.

db.go Outdated
@@ -550,7 +550,7 @@ func (db *DB) reload(deleteable ...string) (err error) {
 	if len(blocks) == 0 {
 		return nil
 	}
-	maxt := blocks[len(blocks)-1].Meta().MaxTime
+	maxt := blocks[len(blocks)-1].Meta().MaxTime + 1
Contributor:

is that really needed?

Contributor:

Ok, it is, because if block A is 0-10 and block B is 10-20, sample t=10 must be in both.

However, this is quite a risky change, affecting lots of things. I wonder if we can abort this and instead try to make the compactor deduplicate these chunks correctly. Looking into that.
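For context, a minimal sketch of the overlap check, assuming intervalOverlap() in db.go is the standard closed-interval test (the comments here are illustrative):

	// intervalOverlap treats both [amin, amax] and [bmin, bmax] as closed
	// intervals, so ranges that merely touch at a boundary count as
	// overlapping.
	func intervalOverlap(amin, amax, bmin, bmax int64) bool {
		return amin <= bmax && bmin <= amax
	}

	// With block A = [0, 10] and block B = [10, 20]:
	// intervalOverlap(0, 10, 10, 20) == true (0 <= 20 && 10 <= 10),
	// so a chunk holding sample t=10 matches both blocks.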

Contributor:

Ok, talked offline with @fabxc; in fact we can kill this change and keep just the nextAt adjustment.

This is because:

if a block has minTime: 10 and maxTime: 20, the intent was for this to be the interval [10, 20)

db.go Outdated
@@ -807,7 +807,7 @@ func (db *DB) Querier(mint, maxt int64) (Querier, error) {

 func rangeForTimestamp(t int64, width int64) (mint, maxt int64) {
 	mint = (t / width) * width
-	return mint, mint + width
+	return mint, mint + width - 1
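A worked example of that change (illustrative values, not from the PR):

	// width = 10, t = 13:
	// mint = (13 / 10) * 10 = 10
	// before: rangeForTimestamp(13, 10) == (10, 20); the closed ranges
	//         [10, 20] and [20, 30] share t=20.
	// after:  rangeForTimestamp(13, 10) == (10, 19); the closed ranges
	//         [10, 19] and [20, 29] are disjoint.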
@bwplotka (Contributor) commented Jun 13, 2018:

Mhm, can we instead change only here, in db.compact():

	// Wrap head into a range that bounds all reads to it.
	head := &rangeHead{
		head: db.head,
		mint: mint,
		maxt: maxt - 1, // block range is always [mint, maxt)
	}

db_test.go Outdated
@@ -1098,3 +1100,51 @@ func TestOverlappingBlocksDetectsAllOverlaps(t *testing.T) {
 	{Min: 8, Max: 9}: {nc1[8], nc1[9]}, // 7-10, 8-9
 }, OverlappingBlocks(nc1))
 }

func TestChunkAtBlockBoundary(t *testing.T) {
Contributor:

Please add a comment that this is a regression test for issue #347.

Contributor:

Otherwise the test is 👍

@BenoitKnecht (Contributor, Author):

@Bplotka Thanks for your review. So if I understand correctly, you want me to implement the first alternative I described here:

A possible way to fix this would be to say that block.MaxTime is an exclusive upper bound (so a chunk would only be included if chunk.MinTime < block.MaxTime && block.MinTime <= chunk.MaxTime).

The other option would be to make sure that blocks don't overlap by the current definition of intervalOverlap(), i.e. make sure that blocks[1].MinTime = blocks[0].MaxTime + 1.

The clean way to do that (in my opinion) would be to modify intervalOverlap() so that it interprets one of the intervals as half-open (i.e. [a, b) instead of [a, b]). I didn't opt for that solution because it seemed to have a bigger potential for side effects.

For instance, when looking at a chunk, its [c.MinTime, c.MaxTime] range includes c.MaxTime, but for a block, [b.MinTime, b.MaxTime) would be interpreted as excluding b.MaxTime. That's confusing, and it means that intervalOverlap() should behave differently if comparing chunk intervals or block intervals.

So I can try to rework this PR to interpret block intervals as [a, b), but chunk intervals would still be interpreted as [a, b], which will most likely cause confusion down the line. Do you still think it's a better approach?
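As a sketch, the first alternative as a standalone predicate (chunkInBlock is a hypothetical helper name, not code from this PR):

	// chunkInBlock treats the block's upper bound as exclusive: a chunk
	// matches only if it starts strictly before blockMax, while the
	// chunk's own interval stays closed.
	func chunkInBlock(chunkMin, chunkMax, blockMin, blockMax int64) bool {
		return chunkMin < blockMax && blockMin <= chunkMax
	}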

@bwplotka (Contributor):

That's confusing, and it means that intervalOverlap() should behave differently if comparing chunk intervals or block intervals.

Unfortunately that is the case now. There is a small intentional inconsistency here that we need to address in this PR, IMO.

	// Wrap head into a range that bounds all reads to it.
	head := &rangeHead{
		head: db.head,
		mint: mint,
		// Block range is always [mint, maxt), so we need to explicitly exclude maxt samples.
		maxt: maxt - 1,
	}

...is one way of doing this.

@bwplotka (Contributor):

So I can try to rework this PR to interpret block intervals as [a, b), but chunk intervals would still be interpreted as [a, b], which will most likely cause confusion down the line. Do you still think it's a better approach?

From what I got from @fabxc this is the best approach I think.

@BenoitKnecht BenoitKnecht force-pushed the fix-block-boundaries branch 2 times, most recently from e663c7f to 14b320b Compare June 13, 2018 13:08
@bwplotka (Contributor) left a comment:

LGTM, just minor nits.

db.go Outdated
	maxt: maxt,
	// We remove 1 millisecond from maxt because block
	// intervals are half-open: [b.MinTime, b.MaxTime). But
	// chunk intervals are closed ([c.MinTime, c.MaxTime]),
Contributor:

([c.MinTime, c.MaxTime]) -> [c.MinTime, c.MaxTime]

Let's remove the parentheses so as not to confuse here (:

Contributor Author:

Good point, done.

	for p.Next() {
		err = r.Series(p.At(), &lset, &chks)
		testutil.Ok(t, err)
		for _, c := range chks {
Contributor:

Can we at least count chunks here? We should have 2 blocks (?), one chunk in each, I think. That is the only thing I wanted to add in my extension. (I wanted to assert on the exact chunk time range, but just counting these would be nice as well.)

Contributor Author:

I think we can assert the number of chunks per block, yes.

Not sure about checking the number of blocks, because it depends on this condition in db.compact():

	if db.head.MaxTime()-db.head.MinTime() <= db.opts.BlockRanges[0]/2*3 {
		break
	}

which could change and break the test for no reason. (In fact, we only have one block here because of this condition.)
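For concreteness, a worked example of that condition, assuming the default smallest block range of 2h (BlockRanges[0] = 7,200,000 ms): 7,200,000 / 2 * 3 = 10,800,000 ms = 3h, so the head is only compacted into a persisted block once it spans more than 1.5x the smallest block range.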

@bwplotka (Contributor) commented Jun 14, 2018:

I think it is a priority to test whether the next block will actually include the chunk we ignore with

	head := &rangeHead{
		head: db.head,
		mint: mint,
		// We remove 1 millisecond from maxt because block
		// intervals are half-open: [b.MinTime, b.MaxTime). But
		// chunk intervals are closed: [c.MinTime, c.MaxTime];
		// so in order to make sure that overlaps are evaluated
		// consistently, we explicitly remove the last value
		// from the block interval here.
		maxt: maxt - 1,
	}

Can we somehow make the test resilient to changes here?

	if db.head.MaxTime()-db.head.MinTime() <= db.opts.BlockRanges[0]/2*3 {
		break
	}

Contributor Author:

Fair point, I'll look into it.

@bwplotka (Contributor):

@fabxc @gouthamve @krasi-georgiev can we have a review on this? It is ready to go in my opinion (despite my minor comments).

@BenoitKnecht (Contributor, Author):

I noticed that we also need to change the following in order to be consistent:

diff --git a/db.go b/db.go
index bd3fdf4..6ddbbd3 100644
--- a/db.go
+++ b/db.go
@@ -785,7 +785,7 @@ func (db *DB) Querier(mint, maxt int64) (Querier, error) {
 
        for _, b := range db.blocks {
                m := b.Meta()
-               if intervalOverlap(mint, maxt, m.MinTime, m.MaxTime) {
+               if intervalOverlap(mint, maxt, m.MinTime, m.MaxTime-1) {
                        blocks = append(blocks, b)
                }
        }
@@ -828,7 +828,7 @@ func (db *DB) Delete(mint, maxt int64, ms ...labels.Matcher) error {
 
        for _, b := range db.blocks {
                m := b.Meta()
-               if intervalOverlap(mint, maxt, m.MinTime, m.MaxTime) {
+               if intervalOverlap(mint, maxt, m.MinTime, m.MaxTime-1) {
                        g.Go(func(b *Block) func() error {
                                return func() error { return b.Delete(mint, maxt, ms...) }
                        }(b))

Rather than just adding this as a commit, I'm wondering if it wouldn't be clearer to replace 1afe3db with

diff --git a/head.go b/head.go
index 2c7c7ec..ecfca36 100644
--- a/head.go
+++ b/head.go
@@ -742,7 +742,7 @@ func (h *headChunkReader) Chunk(ref uint64) (chunkenc.Chunk, error) {
        s.Unlock()
 
        // Do not expose chunks that are outside of the specified range.
-       if c == nil || !intervalOverlap(mint, maxt, h.mint, h.maxt) {
+       if c == nil || !intervalOverlap(mint, maxt, h.mint, h.maxt-1) {
                return nil, ErrNotFound
        }
        return &safeChunk{
@@ -849,7 +849,7 @@ func (h *headIndexReader) Series(ref uint64, lbls *labels.Labels, chks *[]chunks
 
        for i, c := range s.chunks {
                // Do not expose chunks that are outside of the specified range.
-               if !intervalOverlap(c.minTime, c.maxTime, h.mint, h.maxt) {
+               if !intervalOverlap(c.minTime, c.maxTime, h.mint, h.maxt-1) {
                        continue
                }
                *chks = append(*chks, chunks.Meta{

It feels more systematic, because we don't treat rangeHead and regular blocks differently. Every time intervalOverlap() compares a block (or head) range with any other range, we remove 1 from the block's MaxTime to account for the half-open interval.

What do you think?

@bwplotka (Contributor) commented Jun 14, 2018:

Is intervalOverlap really used only against chunks? It would be really hard to verify that this change isn't breaking anything else, but we can try to dig through it (the querier uses it as well).

@BenoitKnecht (Contributor, Author):

OK, unless someone else objects, I'll update this PR with the modified intervalOverlap() arguments.

@bwplotka (Contributor) commented Jun 14, 2018:

The only argument for rangeHead is that it is a single hacky point in a single place, responsible for writing the block, rather than two buried deep down in the overall chunk retrieval. But it's up to you.

@krasi-georgiev (Contributor) commented Jun 14, 2018:

I am still digging through the code to understand why block time range doesn't match the chunk time range.

This PR will fix the current issue, but I'm just wondering if this inconsistency will bite us again.

It would be great if @fabxc or @gouthamve could give us some hints.

@BenoitKnecht (Contributor, Author):

Alright, after talking with @Bplotka on Slack, I can get behind the idea of changing rangeHead and not treating head intervals like block intervals when we call intervalOverlap().

But we do need to pass m.MaxTime-1 to intervalOverlap() in db.Querier and db.Delete. The idea being, if your blocks are [a, b), [b, c), [c, d) and you make a query for the interval [b, c], the current implementation will return a querier over all three blocks; but you only need [b, c) and [c, d), because [a, b) doesn't contain b.
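A concrete illustration (made-up numbers): with blocks [0, 10), [10, 20), [20, 30) whose metas store MaxTime = 10, 20, 30, a query for [10, 20] also selects the first block, because intervalOverlap(10, 20, 0, 10) is true (10 <= 10). With m.MaxTime-1 the check becomes intervalOverlap(10, 20, 0, 9), which is false, so only the last two blocks are returned.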

@BenoitKnecht (Contributor, Author):

(But I can submit that fix in a different PR, if it simplifies review.)

@bwplotka (Contributor) commented Jun 14, 2018:

why block time range doesn't match the chunk time range.

Yeah, it is a big unknown.

Yeah, agreed. These 3 places (rangeHead, db.Delete, db.Querier) are where we wrongly assume a [t1, t2] range for block time ranges, whereas the block time range is always [t1, t2).

I think changing intervalOverlap inside Series and Chunk, so basically:

treating head intervals like block intervals when we call intervalOverlap().

...is not safe, because the TSDB code is definitely not prepared for this (it might make further bugs trickier to find).

@bwplotka (Contributor):

No, this PR is fine for it I think

@BenoitKnecht (Contributor, Author):

No, this PR is fine for it I think

OK, I'll push the fix and a regression test.

@gouthamve (Collaborator):

I don't remember if anything in particular made us choose this. I think it should be aligned with what Prometheus expects when a query comes in for a to b.

@gouthamve (Collaborator):

Prometheus expects it to be [t0, t1] and I think for the sake of consistency, let us keep it inclusive everywhere.

I am not averse to maxt: maxt - 1 when creating the persisted block, i.e., making the maxt of a block look like xxxx999.

@bwplotka (Contributor):

Yeah, @krasi-georgiev, I also got a clear message that this inconsistency is only for aesthetic reasons (no [100000-299999]).

This means that in theory we could change the block range to be inclusive [t1, t2] just fine, but it won't be a trivial change... For example, we really need backward compatibility here, so we need to make the compactor able to handle [ ) as well as [ ] blocks.

@BenoitKnecht (Contributor, Author):

@gouthamve I also favor this solution; I agree it's more consistent.

@Bplotka I think we can handle backwards compatibility in the compactor by checking if the blocks being compacted overlap (when interpreted as [a, b] blocks), and if so dropping any chunk that starts at b. We may also want to check the last block for an outsider chunk, but this is where things become ambiguous, and the outsider chunk won't hurt until we compact the block further.

@BenoitKnecht (Contributor, Author):

@Bplotka I don't think it's really an issue though. Blocks that have been created so far are [a, b] blocks, not [a, b). They were intended to be [a, b) blocks, but that's not how they were written on disk, and it's not how they were defined in code either. So for all intents and purposes, blocks created so far are [a, b] blocks.

The issue with those blocks only manifests itself when they're contiguous and compacted together, because if there was a chunk starting at b and you compact [a, b] with [b, c], you end up with that chunk being duplicated.

But taken individually, or if they don't overlap (such as in the non-contiguous case you mentioned), it's perfectly correct to treat them as [a, b] blocks; they do contain data from that range, including b.

So I don't think that the proposal to keep treating blocks as [a, b] intervals (but making sure that they don't overlap by writing them as [a, b-1] from now on) is a bigger change than starting to treat all blocks as [a, b) intervals.

@krasi-georgiev (Contributor):

I am strongly in favour of making block and chunk ranges consistent.
Since this doesn't require an urgent fix, let's spend enough time to make it right.

@BenoitKnecht (Contributor, Author):

@gouthamve @krasi-georgiev @Bplotka I've implemented the fix as discussed: blocks are closed intervals that don't overlap. It'd be great if you could take a look when you have the time.

@krasi-georgiev (Contributor):

Thanks! Testing it locally, and it works as expected so far.

We should also update OverlappingBlocks to detect overlaps properly:
https://github.com/prometheus/tsdb/blob/c848349f07c83bd38d5d19faa5ea71c7fd8923ea/db.go#L611
I think it is here:
https://github.com/prometheus/tsdb/blob/c848349f07c83bd38d5d19faa5ea71c7fd8923ea/db.go#L636

We should also make sure that we convert all existing metas to remove that extra ms at the end. To avoid running the conversion on every start, maybe bump the meta version to 2, so that at startup we read all meta.json files of version 1, convert them to version 2, and then run the new OverlappingBlocks func.

@fabxc (Contributor) commented Jun 27, 2018:

Thanks for all the debugging work you've put into this.

Looking at the current fix: we account for that in rangeForTimestamp by doing -1 on the max time, so it returns the closed interval for a given timestamp; makes sense. That will affect the actual interval written to meta.json.

What will happen if I now have a TSDB that has blocks written the old way and this new way?
It seems we would have to account for that difference.
Generally the change would require a bump of the meta.json format to 3 (2 has been used in an erroneous format bump before and must be skipped). Then we could set block.MaxTime-- when loading a block with meta version 1 and we should be good.

Please be aware this change makes debugging harder, since the new maximum timestamps are hard to parse for humans. Tooling can help here, I'm sure.
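A minimal sketch of that version-1 adjustment (adjustMeta is a hypothetical function name; the real meta-loading code differs):

	// adjustMeta converts a version-1 meta, whose MaxTime was written as an
	// exclusive bound, so that the in-memory MaxTime is a closed upper bound.
	func adjustMeta(m *BlockMeta) {
		if m.Version == 1 {
			m.MaxTime--
		}
	}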

@fabxc (Contributor) commented Jun 27, 2018:

Whatever the outcome of the discussion on whether half-open or closed intervals are better for this particular case, I do want to point out that this change is disruptive (esp. with the meta version change) and may cause other issues where we've forgotten to consider the -1 change.

From what I understand, chunk boundaries are currently in the right place. The only problem is that when compacting from the head, we consider one chunk too many. Truncation works correctly as well, since the chunk won't be dropped from the head and will be found in the next compacted block as well.

The only error would thus be that we consider one chunk too many. Would it potentially be safer to do nothing but change this single line to maxt-1?

https://github.com/prometheus/tsdb/blob/40766622eeedfcd6287feaa73c9ba8c72d60f1c9/db.go#L380-L385

It's just a thought I had; please triple-check me on it and let me know whether that's viable.
I missed that this is essentially what Bartek suggested further up quite a while ago.

One thing worth noting is that block ranges and chunk ranges are semantically different things. The chunk range describes the lowest and highest timestamp of actually existing samples.
The block range specifies an abstract range in which samples may exist. It doesn't imply mint and maxt have a real sample in any case.

@gouthamve (Collaborator):

The block range specifies an abstract range in which samples may exist.

Which essentially means the block range is inclusive on both ends, right: [ ]?

@fabxc (Contributor) commented Jun 27, 2018:

No, that's up to how you define it, and right now our definition is [ ). Not changing that currently appears much safer from my point of view, regardless of how one may judge the initial decision.

@bwplotka (Contributor) commented Jun 27, 2018:

Yeah, I have to agree with Fabian here.

We were already on the same page that some change like this (some comments above):
#348 (comment)
(the same thing @fabxc proposed)

...will fix this issue entirely. However, we tried to understand the point of the inconsistency and whether it is worth making it consistent.

After the research, it is clear in my opinion that there is huge risk and effort in changing this inconsistency (making block ranges inclusive [ ]) and minimal (if any) actual benefit to having consistent ranges.

Or am I missing these benefits? Can we enumerate them?

@krasi-georgiev (Contributor) commented Jun 27, 2018:

One thing worth noting is that block ranges and chunk ranges are semantically different things. The chunk range describes the lowest and highest timestamp of actually existing samples.
The block range specifies an abstract range in which samples may exist. It doesn't imply mint and maxt have a real sample in any case.

If the block range specifies an abstract range in which samples may exist, then an overlap even by 1 millisecond is still an overlap, no?
Making it consistent should make it easier to understand and troubleshoot bugs.

Please be aware this change makes debugging harder, since the new maximum timestamps are hard to parse for humans. Tooling can help here, I'm sure.

Do you mean the 999 at the end would make it more difficult to read, or something else? I always use some online converter even with the current timestamps, so the 999s would make it more difficult for me.

@bwplotka (Contributor):

Making it consistent should make it easier to understand and troubleshoot bugs.

Maybe a little bit. But changing this will introduce thousands of other issues and debugging problems (a mix of "legacy blocks" and "new blocks").

@krasi-georgiev (Contributor):

Maybe a little bit. But changing this will introduce thousands of other issues and debugging problems (a mix of "legacy blocks" and "new blocks").

Isn't this why we have the meta file versions? Start Prometheus, read the meta versions, and convert where needed. Just a few lines of code, no?

@fabxc (Contributor) commented Jun 28, 2018:

Yes, we have versions so that we can make changes where necessary. Doesn't mean we should do it whenever possible.

Start Prometheus, read the meta versions, and convert where needed. Just a few lines of code, no?

The concern is exactly that things aren't that easy in practice.

@BenoitKnecht (Contributor, Author):

All right, so it looks like the majority leans towards considering blocks as [ ) intervals, as it's less likely to introduce new bugs and should fix the initial issue. Thank you all for your thoughtful comments!

I'm going to push that implementation later today.

Block intervals are bounded by `block.MinTime` and `block.MaxTime`, but they
define a half-open interval: `[block.MinTime, block.MaxTime)`.

However, when deciding whether a chunk was part of a block or not, the
`intervalOverlap()` function would consider both the chunk and the block
intervals as being closed.

Rather than modify the logic in `intervalOverlap()`, we explicitly
remove the last value from the interval when reading from head to
persist blocks.

Signed-off-by: Benoît Knecht <benoit.knecht@fsfe.org>
Signed-off-by: Benoît Knecht <benoit.knecht@fsfe.org>
Blocks are half-open intervals [a, b), while all other intervals
(chunks, head, ...) are closed intervals [a, b].

Make that distinction explicit by defining `OverlapsClosedInterval()`
methods for blocks and chunks, and using them in place of the more
generic `intervalOverlap()` function.

This change also fixes `db.Querier()` and `db.Delete()`, which could
previously return one extraneous block at the end of the specified
interval.

Signed-off-by: Benoît Knecht <benoit.knecht@fsfe.org>
Due to the way blocks used to overlap by 1 millisecond (see prometheus-junkyard#347), when
requesting a 2-hour interval starting at `blocks[1].MinTime`, the
`Querier` would consider three blocks: `blocks[0]`, `blocks[1]` and
`blocks[2]`, because `blocks[0].MaxTime` and `blocks[2].MinTime` were in
that interval.

However, if the blocks don't overlap, only two blocks should be
returned: `blocks[1]` and `blocks[2]`. This test ensures that it's
indeed the case.

Signed-off-by: Benoît Knecht <benoit.knecht@fsfe.org>
@BenoitKnecht (Contributor, Author):

I've pushed the new code. I tried to make the distinction between block intervals and chunk intervals as explicit as possible by replacing the generic intervalOverlap() function with OverlapsClosedInterval() methods on the block and chunk objects.

Let me know what you think.

@bwplotka (Contributor) left a comment:

Logic makes sense to me 👍

But the method names are a bit confusing. Can we work on this a little?

@@ -539,6 +539,13 @@ func (pb *Block) Snapshot(dir string) error {
 	return nil
 }

// Returns true if the block overlaps [mint, maxt].
Contributor:

I'm a bit confused. You have two methods named OverlapsClosedInterval with the comment Returns true if the block overlaps [mint, maxt]. but with different logic.

The inner comments and logic make sense, though. I think we could leave these as methods and name them just isOverlapping (or ...halfOpen... for the block and ...closed... for the chunk) and move the inner comments to the method comments.

Contributor:

These methods are a good place to explain the difference in time ranges between blocks and chunks, maybe?

Contributor Author:

Yeah, I can see why you find it confusing. The thing is that the argument of the OverlapsClosedInterval() method is always a closed interval; the difference is in the object calling the method.

In the case of block.OverlapsClosedInterval(mint, maxt), we compare the half-open [block.MinTime, block.MaxTime) with the closed [mint, maxt]; and in the case of chunk.OverlapsClosedInterval(mint, maxt), we compare the closed [chunk.MinTime, chunk.MaxTime] with the closed [mint, maxt].

Does that make sense?
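A sketch of the two methods under that definition (the merged code may differ in details):

	// Block: compares the half-open [pb.meta.MinTime, pb.meta.MaxTime)
	// against the closed [mint, maxt].
	func (pb *Block) OverlapsClosedInterval(mint, maxt int64) bool {
		return pb.meta.MinTime <= maxt && mint < pb.meta.MaxTime
	}

	// Chunk: compares the closed [cm.MinTime, cm.MaxTime]
	// against the closed [mint, maxt].
	func (cm Meta) OverlapsClosedInterval(mint, maxt int64) bool {
		return cm.MinTime <= maxt && mint <= cm.MaxTime
	}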

Contributor Author:

And you're right, I should expand on the difference between block and chunk intervals in those method descriptions.

Contributor:

Wait, but one is a closed interval and the other is half-open, right?

@bwplotka (Contributor) commented Jul 3, 2018:

Or even just Overlaps: basically we want to know whether mint and maxt overlap with the block or chunk, and it gives us the right answer (e.g. if mint is touching block.maxt, there is NO overlap).

Contributor Author:

I'm not 100% convinced it's clearer to just call it Overlaps. I wanted to highlight the fact that the mint and maxt arguments refer to a closed interval, to prevent someone from using it to compare a block with another block, in which case [mint, maxt) would be a half-open interval, whereas the method expects it to be closed.

That being said, if it's the only thing keeping this PR from being merged, I'm happy to make the change.

@bwplotka (Contributor) commented Jul 4, 2018:

  1. We need someone from the maintainers to approve this PR anyway, so maybe someone can give their view on this as well; I don't want to block this PR either (:
  2. If I do block1.Overlaps(block2.min, block2.max), is there anything confusing about the fact that it will check overlaps using [a, b), because blocks are like this? I think having something like block1.OverlapsClosedInterval(block2.min, block2.max) would be confusing for future readers of this code, because it says closed but actually applies half-open overlap logic. But I might be wrong here!

Contributor Author:

But block1.OverlapsClosedInterval(block2.min, block2.max) would return the wrong result, because [block2.min, block2.max) isn't a closed interval. You would get block1.min <= block2.max && block2.min < block1.max instead of the correct value for two [a, b) intervals, which is block1.min < block2.max && block2.min < block1.max.

That's why I wanted the method name to explicitly state that it requires a closed interval as its argument; it doesn't work on other types of intervals.
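A concrete boundary case (made-up ranges): take block1 = [20, 30) and block2 = [10, 20). block1.OverlapsClosedInterval(10, 20) evaluates 20 <= 20 && 10 < 30, i.e. true, even though the half-open intervals are disjoint; the correct half-open check, 20 < 20 && 10 < 30, is false.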

Contributor:

Got you, agree (:

@brian-brazil (Contributor):

What do we need to do to get this in?

@bwplotka (Contributor) commented Jul 4, 2018:

@brian-brazil What do you mean? From my perspective, the current code (apart from the discussion around method naming) is what we need to fix #347.

@brian-brazil (Contributor):

This code review appears to be ongoing, so I'm wondering what changes (if any) the maintainers are looking for.

@fabxc (Contributor) left a comment:

Thanks again for all the debugging and being patient with finding the right solution.

@fabxc fabxc merged commit 77db94c into prometheus-junkyard:master Jul 5, 2018