Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

block: Handle gracefully extra duplicated chunk produced by TSDB issue. #375

Merged
merged 1 commit into from
Jun 15, 2018

Conversation

bwplotka
Copy link
Member

TSDB issue: prometheus-junkyard/tsdb#347

How we handle it?

  • segregate this issue in special stat entry in verifier.
  • auto-fix broken plan block before thanos compaction.
  • adding repair job to run offline batch repair for indivdual blocks.

NOTE: At this point we have no power of fixing the bug when someone uses local compaction ):

Fixes #354

PTAL @BenoitKnecht
Signed-off-by: Bartek Plotka bwplotka@gmail.com

@BenoitKnecht
Copy link

Looks good! The tests are failing, but it looks like it's due to the changes you made to some function signatures, nothing critical. I just have a few minor comments.

for i, bid := range *verifyIDWhitelist {
id, err := ulid.Parse(bid)
if err != nil {
return errors.Wrap(err, "not valid ULID found in --id-whitelist flag")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"not valid" → "invalid"


n := i.OutsideChunks - (i.CompleteOutsideChunks + i.Issue347OutsideChunks)
if n > 0 {
errMsg += fmt.Sprintf("found %d chunks non-completely outside the block time range.", n)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"non-completely" → "partially"

// GatherIndexIssueStats returns useful counters as well as outsider chunks (chunks outside of block time range) that
// helps to assess index health.
// It considers https://github.com/prometheus/tsdb/issues/347 as something that Thanos can handle.
// See Stats.AcceptableOutsiders for details.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you renamed Stats.AcceptableOutsiders to Stats.Issue347OutsideChunks.

if c.MinTime > maxTime || c.MaxTime < minTime {
stats.CompleteOutsiders++
stats.CompleteOutsideChunks++
} else if c.MinTime == maxTime && c.MaxTime > maxTime {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if it can actually happen, but in theory a chunk with a single value could have c.MinTime == c.MaxTime, and if it's also at the very end of the block (c.MinTime == maxTime), it wouldn't be picked up by this condition.

I think you just need to test if c.MinTime == maxTime.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can agree with it, but single sample chunk is very sad chunk (: But can happen probably.

return true, nil
}
return false, nil
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see why you would collect separate stats for outside chunks and complete outside chunks, but when it comes to repairing blocks, they're really the same thing.

I mean if you have two intervals, block = [a, b) and chunk = [c, d], they don't overlap at all even if b == c. So you could have a single ignore function:

func IgnoreOutsideChunk(mint, maxt int64, _ *chunks.Meta, curr *chunks.Meta) (bool, error) {
	if curr.MinTime >= maxt || curr.MaxTime < mint {
		return true, nil
	}
	return false, nil
}

If you don't merge those two functions, then I think you should change the condition in IgnoreIssue347OutsideChunk() to

if curr.MinTime == maxt {
	return true, nil
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea will do latter. Can agree that "complete" and "near" is kind of the same -> we can kill them just fine, but there is no upstream issue causing "complete". with issue347 we know what is that root cause and we know that it will not be fixed until it lands and land into Prometheus AND Prometheus go released. This will be quite a long time. With complete and duplicates even though they safe to repair they only occurred once Thanos had an issue, so the only explicit manual command should be fixing these, nothing automated.

return false, errors.Wrapf(err, "gather index issues for block %s", pdir)
}

if err = stats.CriticalErr(); err == nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want to use if err := ... here, since you're in a different scope. Same for a few more if conditions below.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also err != nil?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ups

for _, bid := range whilelistIDs {
if id.Compare(bid) == 0 {
found = true
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return true here and get rid of found variable

ExactSum int
if i.OutOfOrderSeries > 0 {
errMsg += fmt.Sprintf(
"%d/%d series have an average of %.3f out-of-order chunks. "+

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the sentence should be "x, y, z" as opposed to "x. y. z". You should probably have errMsg []string and join it with ", " in the end.

repl = append(repl, c)
continue
if ignore {
break

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would labelling outer loop with OUTER_LOOP: and continue OUTER_LOOP work here?

return false, nil
}

if curr.MinTime > prev.MaxTime {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you have curr.Mintime = prev.MaxTime?

Copy link
Member Author

@bwplotka bwplotka Jun 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no. Chunks never overlaps. Blocks overlaps with miliesecond always but means [t1, t2) (TSDB known inconsistency). But good question.

)

for _, ignoreChkFn := range ignoreChkFns {
ignore, err = ignoreChkFn(mint, maxt, last, &c)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some ignore check functions will fail if last is nil, is this ok?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes - it is all handled.

@@ -532,35 +534,35 @@ func (cg *Group) areBlocksOverlapping(include *block.Meta, excludeDirs ...string
return nil
}

func (cg *Group) compact(ctx context.Context, dir string, comp tsdb.Compactor) (compID ulid.ULID, err error) {
func (cg *Group) compact(ctx context.Context, dir string, comp tsdb.Compactor) (nothingToCompact bool, err error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't seem to be utilizing named returns

return false, errors.Wrapf(err, "gather index issues for block %s", pdir)
}

if err = stats.CriticalErr(); err == nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also err != nil?

return false, errors.Wrapf(err, "deleting old block %s failed. You need to delete this block manually", id)
}

return false, nil

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we continue with execution here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this return as we would want to continue with the rest of the pdir := range plan

Copy link
Contributor

@domgreen domgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, i think it would be good to make the Compact section of code more clear ... also changes as requested by @BenoitKnecht + @stefan-improbable


idMatcher = func(id ulid.ULID) bool {
found := false
for _, bid := range whilelistIDs {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought - would it be better to do this using a map? map[ulid.ULID]struct{} then see if the ulid is present in the map? Not a big deal anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you cannot simply do map[ulid.ULID]struct{} (ulid.ULID is a byte slice) but we cannot do map[string] - good point!

// We keep going through the outer loop until no group has any work left.
if id != (ulid.ULID{}) {
if !nothingToCompact {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - !nothingToCompact reads weird would prefer !completed or !compactionCompleted

@@ -440,23 +442,23 @@ func (cg *Group) Resolution() int64 {

// Compact plans and runs a single compaction against the group. The compacted result
// is uploaded into the bucket the blocks were retrieved from.
func (cg *Group) Compact(ctx context.Context, dir string, comp tsdb.Compactor) (ulid.ULID, error) {
func (cg *Group) Compact(ctx context.Context, dir string, comp tsdb.Compactor) (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to add to comment what the bool represents. Will returns true if you should continue compaction on subsequent groups.

if err != nil {
cg.compactionFailures.Inc()
}
cg.compactions.Inc()

return id, err
return continueCompact, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Abit confused here ... continueCompaction here turns into nothingToCompact in thanos/compact.go and in the inner compact below you return nothingToCompact as true if its done ... so this then gets assinged to continueCompact ... maybe im missing something but the vars are abit confusing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just change it all to continueCompact? Think this would read a lot better and help keep it simple.

}

if err = stats.Issue347OutsideChunksErr(); err != nil {
level.Info(cg.logger).Log("msg", "stopping compaction and repairing invalid plan block", "err", err.Error(), "broken_block_id", pdir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - sentence

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually looks like this meets convention so ignore 👍

return false, errors.Wrapf(err, "deleting old block %s failed. You need to delete this block manually", id)
}

return false, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this return as we would want to continue with the rest of the pdir := range plan

@bwplotka bwplotka force-pushed the issue354-outsider-handling branch 2 times, most recently from 90a2b8d to 90228b3 Compare June 14, 2018 10:47
@bwplotka
Copy link
Member Author

Thanks all for the detailed review! Apologise for so many nits.. That always happens on 2 AM PRs ^^

Addressed all PTAL again @BenoitKnecht @stefan-improbable @domgreen

On green CI, before landing, I am going to test the fixed compaction flow on our testing cluster.

// Issue347OutsideChunksErr returns error if stats indicates issue347 block issue, that is repaired explicitly before compaction (on plan block).
func (i Stats) Issue347OutsideChunksErr() error {
if i.Issue347OutsideChunks > 0 {
return errors.Errorf("Found %d chunks outside the block time range introduced by https://github.com/prometheus/tsdb/issues/347", i.Issue347OutsideChunks)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: found with small f in the beginning due to style guide

var errMsg string

if err := i.CriticalErr(); err != nil {
errMsg += err.Error()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use strings.join ", " similarly to CriticalErr

@stefan-improbable
Copy link

@Bplotka is it possible to unit test compaction and the block error detection/repair logic?

}
return errors.Wrap(err, "compaction")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we change this section so we do not indent the normal flow?

					id, err := g.Compact(ctx, compactDir, comp)
					if err != nil {
						if compact.IsIssue347Error(err) {
							err = compact.RepairIssue347(ctx, logger, bkt, err)
							if err == nil {
								done = false
								continue
							}
						}
						return errors.Wrap(err, "compaction")
					}

					// If the returned ID has a zero value, the group had no blocks to be compacted.
					// We keep going through the outer loop until no group has any work left.
					if id != (ulid.ULID{}) {
						done = false
					}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but that will increase the indentation deepness. It's a tradeoff. What do you prefer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer less indentation on standard flow but it is not a blocker, just a preference.

@bwplotka
Copy link
Member Author

bwplotka commented Jun 14, 2018

@stefan-improbable yes, and we have these.

I added compaction upstream test though: prometheus-junkyard/tsdb#349 (we use vanilla TSDB compactor internally)

@bwplotka
Copy link
Member Author

@stefan-improbable actually we have even e2e tests, sorry forgot abou these: https://github.com/improbable-eng/thanos/blob/master/pkg/compact/compact_e2e_test.go

@bwplotka
Copy link
Member Author

PTAL again -> Added notion of ThanosMeta.Source

@bwplotka bwplotka force-pushed the issue354-outsider-handling branch 2 times, most recently from f35be41 to 76c8191 Compare June 14, 2018 15:46
Copy link
Contributor

@domgreen domgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@domgreen
Copy link
Contributor

I like the addition of ThanosMeta.Source 👍

@bwplotka
Copy link
Member Author

I am going to leave it for a day on our beta cluster, but seems to be good to go 👍

IndexFilename = "index"
// ChunksDirname is the known dir name for chunks with compressed samples.
ChunksDirname = "chunks"
// Real upload source of the block.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Source is ..., comment should start with object name due to govet or go lint conventions.

ExactSum int
if i.OutOfOrderSeries > 0 {
errMsg = append(errMsg, fmt.Sprintf(
"%d/%d series have an average of %.3f out-of-order chunks. "+

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: comment should be "%d/%d series have an ... chunks: %.3f of these ... time range)" (replace first period with colon and don't sentence in period so that Join(", ") works properly.

}

if len(errMsg) > 0 {
return errors.New(strings.Join(errMsg, ","))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: replace "," with ", "

CompleteOutsiders int
n := i.OutsideChunks - (i.CompleteOutsideChunks + i.Issue347OutsideChunks)
if n > 0 {
errMsg = append(errMsg, fmt.Sprintf("found %d chunks non-completely outside the block time range.", n))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no period at the end of error message

}

if i.CompleteOutsideChunks > 0 {
errMsg = append(errMsg, fmt.Sprintf("found %d chunks completely outside the block time range.", i.CompleteOutsideChunks))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: no period at the end of error message

return errors.Errorf("No chunks are out of order, but found some outsider blocks. (Blocks that outside of block time range): %d. Complete: %d",
i.Outsiders, i.CompleteOutsiders)
if len(errMsg) > 0 {
return errors.New(strings.Join(errMsg, ","))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: replace "," with ", "

cb := crc32.Checksum(curr.Chunk.Bytes(), castagnoli)

if ca != cb {
return false, errors.Errorf("non-sequential chunks not equal: %x and %x", ca, cb)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Printing ca and cb is not useful (or is it)?. Printing (last.MinTime, last.MaxTime), (curr.MinTime, curr.MaxTime) is probably better. Or is there a way to indentify bad chunks so that users can debug them further?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is enough. You have block ID so you can go through chunks and see on your own.

// - compactor created blocks
// NOTE: It is not safe to miss "old" block (even that it is newly created) in sync step. Compactor needs to aware of ALL old blocks.
// TODO(bplotka): https://github.com/improbable-eng/thanos/issues/377
if ulid.Now()-id.Time() < uint64(c.syncDelay/time.Millisecond) &&

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't you use time.Now().UTC()?

Copy link
Member Author

@bwplotka bwplotka Jun 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of what? We want to compare ulids which are not same as Time.Now() (they are not that precise)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant using time.Now().UTC() - id.Time(). The functions that "conveniently" call another functions are just introducing an unnecessary indirection.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree here. time.Now().UTC() will not work.

we would endup doing ulid.Timestamp(time.Now().UTC()) which is rewritting function that is exactly for this. so would vote for leaving ulid.Now()

// avoid races when a block is only partially uploaded. This relates only to level 1 blocks.
// NOTE: It is not safe to miss compacted block in sync step. Compactor needs to aware of ALL old blocks.
if meta.Compaction.Level == 1 && ulid.Now()-id.Time() < uint64(c.syncDelay/time.Millisecond) {
// avoid races when a block is only partially uploaded. This relates to all blocks despite:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: despite is probably incorrect word here

TSDB issue: prometheus-junkyard/tsdb#347

How we handle it?
- segregate this issue in special stat entry in verifier.
- auto-fix broken plan block before thanos compaction.
- adding repair job to run offline batch repair for indivdual blocks.

NOTE: At this point we have no power of fixing the bug when someone uses local compaction ):

Fixes #354

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
@bwplotka bwplotka merged commit 5b66b72 into master Jun 15, 2018
@bwplotka bwplotka deleted the issue354-outsider-handling branch December 11, 2018 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

compact: fails with "invalid plan block" error
4 participants